

Search for: All records

Creators/Authors contains: "Roy, Nirupam"


  1. Integrating spatial context into large language models (LLMs) has the potential to revolutionize human-computer interaction, particularly in wearable devices. In this work, we present a novel system architecture that incorporates spatial speech understanding into LLMs, enabling contextually aware and adaptive applications for wearable technologies. Our approach leverages microstructure-based spatial sensing to extract precise Direction of Arrival (DoA) information using a monaural microphone. To address the lack of an existing dataset for microstructure-assisted speech recordings, we synthetically create a dataset called OmniTalk from the LibriSpeech dataset. This spatial information is fused with linguistic embeddings from OpenAI’s Whisper model, allowing each modality to learn complementary contextual representations. The fused embeddings are aligned with the input space of the LLaMA-3.2 3B model and fine-tuned with the lightweight adaptation technique LoRA to optimize for on-device processing. 
    Free, publicly-accessible full text available July 13, 2026
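
The LoRA adaptation named in item 1 can be illustrated with a minimal sketch. It shows the generic low-rank update, y = W x + (alpha / r) B (A x), where W stays frozen and only the small matrices A and B are trained. The helper names, the tiny matrices, and the 1024-wide layer used for the parameter count are all hypothetical and are not taken from the paper.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_forward(x, W, A, B, alpha):
    """Generic LoRA forward pass: frozen W plus a scaled low-rank update.

    W has shape d_out x d_in (frozen), A has shape r x d_in and
    B has shape d_out x r (both trainable). Illustrative only.
    """
    r = len(A)                       # rank of the update
    base = matvec(W, x)              # frozen path
    delta = matvec(B, matvec(A, x))  # low-rank path
    scale = alpha / r
    return [b + scale * d for b, d in zip(base, delta)]


def trainable_params(d_out, d_in, r):
    """Number of trainable parameters in A and B combined."""
    return r * (d_in + d_out)


# Hypothetical 2x2 layer with a rank-1 update.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[0.5, 0.5]]
x = [2.0, 3.0]

# With B initialised to zero, the adapted layer equals the frozen layer.
print(lora_forward(x, W, A, [[0.0], [0.0]], alpha=1.0))

# Hypothetical 1024x1024 layer: rank-8 LoRA versus full fine-tuning.
print(trainable_params(1024, 1024, 8), "vs", 1024 * 1024)
```

For the hypothetical 1024-wide layer, the rank-8 update trains 16,384 values instead of 1,048,576, which is the kind of saving that makes this family of methods attractive for on-device use.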
  2. Localization of networked nodes is an essential problem in emerging applications, including first-responder navigation, automated manufacturing lines, vehicular and drone navigation, asset tracking, Internet of Things, and 5G communication networks. In this paper, we present Locate3D, a novel system for peer-to-peer node localization and orientation estimation in large networks. Unlike traditional range-only methods, Locate3D introduces angle-of-arrival (AoA) data as an added network topology constraint. The system solves three key challenges: it uses angles to reduce the number of measurements required by 4×, and it jointly uses range and angle data for location estimation. We develop a spanning-tree approach for fast location updates and to ensure that the output graphs are rigid and uniquely realizable, even in occluded or weakly connected areas. Locate3D cuts latency by up to 75% without compromising accuracy, surpassing standard range-only solutions. It has a 0.86 meter median localization error for building-scale multi-floor networks (32 nodes, 0 anchors) and 12.09 meters for large-scale networks (100,000 nodes, 15 anchors). 
    Free, publicly-accessible full text available April 28, 2026
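
To see why adding angle data to range data reduces the number of measurements (item 2), consider this minimal geometric sketch. A single range plus an azimuth and elevation to one neighbor already fixes a node's relative 3D position, whereas a range alone only places the node somewhere on a sphere. The function name and the assumption that both nodes share a known orientation frame are illustrative; this is not the Locate3D algorithm, which estimates orientation jointly.

```python
import math


def position_from_range_and_angles(neighbor, rng, azimuth, elevation):
    """Place a node relative to one neighbor using range plus angles.

    neighbor:  (x, y, z) of the neighbor, in meters
    rng:       measured distance to the node, in meters
    azimuth:   angle in the x-y plane from the +x axis, in radians
    elevation: angle above the x-y plane, in radians
    Assumes both nodes share the same orientation frame (hypothetical).
    """
    nx, ny, nz = neighbor
    horizontal = rng * math.cos(elevation)  # projection onto the x-y plane
    return (
        nx + horizontal * math.cos(azimuth),
        ny + horizontal * math.sin(azimuth),
        nz + rng * math.sin(elevation),
    )


# A node 2 m away, straight along +y, at the same height as the neighbor.
print(position_from_range_and_angles((0.0, 0.0, 0.0), 2.0, math.pi / 2, 0.0))
```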
  3. This demonstration presents LiTEfoot, an ultra-low power localization system leveraging ambient cellular signals. To address the limitations of traditional GPS-based tracking systems in terms of power consumption and latency, LiTEfoot employs a non-linear transformation of the cellular spectrum to achieve efficient self-localization. Our design uses a simple envelope detector to realize spectrum folding, enabling the identification of multiple active base stations. 
    Free, publicly-accessible full text available December 4, 2025
  4. In this paper, we introduce a low-power wide-area cellular localization system called LiTEfoot. The core architecture of the radio carefully applies a non-linear transform to the entire cellular spectrum to obtain a systematic superimposition of the synchronization signals at the baseband. The system develops methods to simultaneously identify, from the transformed signal, all the base stations that are active in any cellular band. The radio front end uses a simple envelope detector to realize the non-linear transformation. We build on this low-power radio to implement a self-localization system leveraging ambient 4G-LTE signals. We show that the core system can also be extended to other cellular technologies like 5G-NR and NB-IoT. The prototype achieves a median localization error of 22 meters in urban areas and 50 meters in rural areas. It can sense a 3 GHz wideband LTE spectrum in 10 ms using non-linear intermodulation while consuming 0.9 mJ of energy for a PCB-based implementation and 40 μJ for a CMOS simulation. In other words, LiTEfoot tags can last for 11 years on a coin cell while continuously estimating location every 5 seconds. We believe that LiTEfoot will have widespread implications in city-scale asset tracking and other location-based services. The radio architecture can be useful beyond low-power self-localization and can find application in synchronization and communication on battery-less platforms. 
    Free, publicly-accessible full text available November 4, 2025
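
The non-linear transformation described in item 4 can be illustrated with a toy model. The sketch below assumes an ideal square-law detector (an assumption; the abstract only states that an envelope detector realizes the non-linearity) and shows that squaring the sum of two tones creates an intermodulation product at their difference frequency, that is, near baseband. The frequencies, sample rate, and function name are hypothetical and scaled down for readability.

```python
import math


def tone_amplitude(samples, freq, fs):
    """Amplitude of the component at `freq` via a single-bin DFT."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * freq * i / fs) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq * i / fs) for i, s in enumerate(samples))
    return 2 * math.hypot(re, im) / n


fs = 1000.0            # sample rate in Hz (hypothetical)
n = 1000               # one second of samples
f1, f2 = 200.0, 230.0  # two "carriers" (hypothetical)

# Input: sum of two unit-amplitude tones.
x = [math.cos(2 * math.pi * f1 * i / fs) + math.cos(2 * math.pi * f2 * i / fs)
     for i in range(n)]

# Assumed square-law non-linearity standing in for the envelope detector.
y = [v * v for v in x]

diff = abs(f2 - f1)
print("30 Hz component before detector:", round(tone_amplitude(x, diff, fs), 6))
print("30 Hz component after detector: ", round(tone_amplitude(y, diff, fs), 6))
```

The difference-frequency term follows from the identity 2 cos(a) cos(b) = cos(a - b) + cos(a + b), so signals that were far apart in the spectrum end up superimposed at low frequencies where they can be sampled cheaply.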
  5. This paper presents LiTEfoot, an ultra-low power, wide-area localization system leveraging ambient cellular signals to address the limitations of traditional self-localization systems in terms of power consumption and latency. LiTEfoot uses a non-linear transformation of the cellular synchronization signal to efficiently achieve self-localization by systematically superimposing signals at the baseband. A simple envelope detector is used to realize this non-linear transformation, enabling the identification of multiple active base stations across any cellular band. The system is designed to operate with low power, consuming only 40 μJ of energy per localization update, and achieves a median localization error of 22 meters in urban areas. 
    Free, publicly-accessible full text available November 4, 2025
  6. Estimation of a speaker’s direction and head orientation with binaural recordings can be a critical piece of information in many real-world applications with emerging ‘earable’ devices, including smart headphones and AR/VR headsets. However, it requires predicting the mutual head orientations of both the speaker and the listener, which is challenging in practice. This paper presents a system for jointly predicting speaker-listener head orientations by leveraging inherent human voice directivity and the listener’s head-related transfer function (HRTF) as perceived by the ear-mounted microphones on the listener. We propose a convolutional neural network model that, given a binaural speech recording, can predict the orientation of both speaker and listener with respect to the line joining the two. The system builds on the core observation that the recordings from the left and right ears are differentially affected by the voice directivity as well as the HRTF. We also incorporate the fact that voice is more directional at higher frequencies than at lower frequencies. Our proposed system achieves 2.5 degrees of 90th percentile error in the listener’s head orientation and 12.5 degrees of 90th percentile error for that of the speaker. 
  7. This paper presents the design and implementation of Scribe, a comprehensive voice processing and handwriting interface for voice assistants. Distinct from prior works, Scribe is a precise tracking interface that can co-exist with the voice interface on low sampling rate voice assistants. Scribe can be used for 3D free-form drawing, writing, and motion tracking for gaming. Taking handwriting as a specific application, it can also capture natural strokes and the individualized style of writing while occupying only a single frequency. The core technique includes an accurate acoustic ranging method called Cross Frequency Continuous Wave (CFCW) sonar, enabling voice assistants to use ultrasound as a ranging signal while using their regular microphone system as a receiver. We also design a new optimization algorithm that only requires a single frequency for time difference of arrival. The Scribe prototype achieves 73 μm of median error for 1D ranging and 1.4 mm of median error in 3D tracking of an acoustic beacon using the microphone array used in voice assistants. Our implementation of an in-air handwriting interface achieves 94.1% accuracy with automatic handwriting-to-text software, similar to writing on paper (96.6%). At the same time, the error rate of voice-based user authentication only increases from 6.26% to 8.28%. 
  8. Broadband infrastructure in urban parks may serve crucial functions, including as an amenity to boost overall park use and as a bridge to propagate WiFi access into contiguous neighborhoods. This project, SCC:PG Park WiFi as a BRIDGE to Community Resilience, has developed a new model, Build Resilience through the Internet and Digital Greenspace Exposure (BRIDGE), leveraging off-the-shelf WiFi technology, novel algorithms, community assets, and local partnerships to lower greenspace WiFi costs. This interdisciplinary work draws on computer science, information studies, landscape architecture, and public health. Collaboration methodologies and relational definitions across disciplines are still nascent, especially when paired with civic-engaged, applied research. Student researchers (UG/Grad) are excellent partners in bridging disciplinary barriers and constraints. Their capacity to assimilate multiple frameworks has produced refinements to the project’s theoretical lenses and suggested novel socio-technical methodology improvements. Further, they are excellent ambassadors to community partners and stakeholders. In BRIDGE, we tested two mechanisms to augment student research participation. In both, we leveraged a classic, curriculum-based model named the Partnership for Action Learning in Sustainability program (PALS). This campus-wide, community-engaged initiative pairs faculty and students with community partners. PALS curates economic, environmental, and social sustainability challenges and scopes projects to customize appropriate coursework that addresses identified challenges. Outcomes include literature searches, wireframes, and design plans that target solutions to civic problems. Constraints include the short semester timeframe and curriculum-learning-outcome constraints. (1) On BRIDGE, Dr. Kweon executed a semester-based Landscape Architecture PALS 400-level studio. Eighteen undergraduates conducted in-class and in-field work to assess community needs and proposed design solutions for future park-wide WiFi. Research topics included community-park history, neighborhood demographics, case-study analysis, and land-cover characteristics. The students conducted an in-park community engagement session, via interactive posterboard surveys, to gain input on what park amenities might be redesigned or added to promote WiFi use. The students then produced seven re-design plans; one included a café/garden with an eco-corridor that integrated technology with nature. (2) From the classic, curriculum-based PALS model we created a summer intensive for our five research assistants to stimulate interdisciplinary collaboration in their research tasks and co-analysis of project data products: experimental technical WiFi setup, community survey results, and stakeholder needs assessments. Students met weekly with each other and team leadership, exchanged journal articles, and attended joint research events. This model shows promise for integrating students more formally into an interdisciplinary research project. An end-of-intensive focus group highlighted, from the students’ perspective, the pros and cons of this model. Results: In contrasting the two mechanisms, we found that Model 1 is tried-and-true and produces standardized, reliable products. However, as work is group based, student independence to explore topics/themes of interest is limited. Civic groups are typically thrilled with the diversity of action plans produced. Model 2 provides greater independence in student-learning outcomes, fosters interdisciplinary “dictionary-building” that can be used by the full team, deepens methodological approaches, and allows for student stipend payments. Lessons learned: the intensive time frame needed more research team support and ideally should be extended, when possible, over the full project span. UMD-IRB#1785365-4; NSF-award: 2125526. 
  9. ABSTRACT Measuring interstellar magnetic fields is extremely important for understanding their role in different evolutionary stages of interstellar clouds and star formation. However, detecting the weak field is observationally challenging. We present measurements of the Zeeman effect in the 1665 and 1667 MHz (18 cm) lines of the hydroxyl radical (OH) towards the dense photodissociation region (PDR) associated with the compact H II region DR 21 (Main). From the OH 18 cm absorption, observed with the Karl G. Jansky Very Large Array, we find that the line-of-sight magnetic field in this region is ∼0.13 mG. The same transitions in maser emission towards the neighbouring DR 21(OH) and W 75S-FR1 regions also exhibit Zeeman splitting. Along with the OH data, we use [C II] 158 μm line and hydrogen radio recombination line data to constrain the physical conditions and the kinematics of the region. We find the OH column density to be $\sim 3.6 \times 10^{16}\,(T_{\rm ex}/25\,{\rm K})$ cm$^{-2}$, and that the 1665 and 1667 MHz absorption lines originate from gas where OH and C$^+$ coexist in the PDR. Under reasonable assumptions, we find the measured magnetic field strength for the PDR to be lower than the value expected from the commonly discussed density–magnetic field relation, while the field strength values estimated from the maser emission are roughly consistent with it. Finally, we compare the magnetic field energy density with the overall energetics of DR 21’s PDR and find that, in its current evolutionary stage, the magnetic field is not dynamically important. 
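
To give a sense of why the Zeeman measurement in item 9 is challenging, the sketch below converts the reported line-of-sight field of about 0.13 mG into a frequency and velocity splitting for the two OH lines. The splitting coefficients (3.27 and 1.96 Hz per μG) are commonly quoted literature values that are assumed here rather than taken from the abstract, and conventions differ on whether they describe the full or half separation between circular polarizations. The function names are hypothetical.

```python
C_KM_S = 299792.458  # speed of light in km/s

# Assumed Zeeman coefficients in Hz per microgauss (not from the abstract).
ZEEMAN_HZ_PER_UG = {1665: 3.27, 1667: 1.96}

# Rest frequencies of the OH main lines in MHz.
REST_MHZ = {1665: 1665.4018, 1667: 1667.3590}


def zeeman_split_hz(line, b_los_mg):
    """Frequency splitting in Hz for a line-of-sight field given in mG."""
    return ZEEMAN_HZ_PER_UG[line] * b_los_mg * 1000.0


def zeeman_split_km_s(line, b_los_mg):
    """Same splitting expressed as an equivalent Doppler velocity in km/s."""
    return C_KM_S * zeeman_split_hz(line, b_los_mg) / (REST_MHZ[line] * 1e6)


for line in (1665, 1667):
    print(line, "MHz:",
          round(zeeman_split_hz(line, 0.13), 1), "Hz,",
          round(zeeman_split_km_s(line, 0.13), 4), "km/s")
```

Under these assumptions the splitting is a few hundred hertz, below 0.1 km/s in velocity terms, which is small compared with typical interstellar line widths.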
  10. ABSTRACT The extreme ultraviolet region (EUV) provides most of the ionization that creates the high equivalent width (EW) broad and narrow emission lines (BELs and NELs) of quasars. Spectra of hypermassive Schwarzschild black holes (HMBHs; $M_{\rm BH} \geq 10^{10}\,M_\odot$) with α-discs decline rapidly in the EUV, suggesting much lower EWs. Model spectra for BHs of mass $10^{6}$–$10^{12}\,M_\odot$ and accretion rates $0.03 \leq L_{\rm bol}/L_{\rm Edd} \leq 1.0$ were input to the Cloudy photoionization code. BELs become ∼100 times weaker in EW from $M_{\rm BH} \sim 10^{8}\,M_\odot$ to $M_{\rm BH} \sim 10^{10}\,M_\odot$. The high-ionization BELs (O VI 1034 Å, C IV 1549 Å, and He II 1640 Å) decline in EW from $M_{\rm BH} \geq 10^{6}\,M_\odot$, reproducing the Baldwin effect, but regain EW for $M_{\rm BH} \geq 10^{10}\,M_\odot$. The low-ionization lines (Mg II 2798 Å, Hβ 4861 Å, and Hα 6563 Å) remain weak. Lines for maximally spinning HMBHs behave similarly. Line ratio diagrams for the BELs show that high O VI/Hβ and low C IV/Hα may pick out HMBHs, although O VI is often hard to observe. In NEL BPT diagrams, HMBHs lie among star-forming regions, except for highly spinning, high accretion rate HMBHs. In summary, the BELs expected from HMBHs would be hard to detect using the current optical facilities. From 100 to $10^{12}\,M_\odot$, the emission lines used to detect active galactic nuclei (AGNs) only have high EW in the $10^{6}$–$10^{9}\,M_\odot$ window, where most AGNs are found. This selection effect may be distorting reported distributions of $M_{\rm BH}$. 
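
The EUV decline for very massive black holes in item 10 can be motivated by a standard thin-disc scaling, sketched below. For a geometrically thin, optically thick α-disc at fixed radiative efficiency, the peak disc temperature scales roughly as (Eddington ratio / black hole mass) to the power 1/4. This textbook relation is an assumption brought in for illustration and is not stated in the abstract; the function name is hypothetical.

```python
def relative_peak_temperature(mass_ratio, eddington_ratio_ratio=1.0):
    """Ratio of peak disc temperatures between two black holes.

    mass_ratio:            M_2 / M_1
    eddington_ratio_ratio: (L/L_Edd)_2 / (L/L_Edd)_1
    Uses the thin-disc scaling T_max ~ ((L/L_Edd) / M) ** 0.25.
    """
    return (eddington_ratio_ratio / mass_ratio) ** 0.25


# Going from 1e8 to 1e10 solar masses at the same Eddington ratio.
print(round(relative_peak_temperature(1e10 / 1e8), 3))
```

Under this scaling a hundredfold increase in mass cools the disc peak by about a factor of three, shifting the thermal emission away from the EUV photons that ionize the line-emitting gas.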